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Classical  Computational  Dynamics, 
Constrained  Equations  of  Motion 

Generalized positions q, generalized velocities v, velocity transformation matrix L(q):

$$\dot{q} = L(q)\,v$$

Generalized mass matrix M(q), applied force f, reaction force $g_q^T \lambda$:

$$M(q)\,\dot{v} = f(t, q, v) - g_q^T(q, t)\,\lambda$$

$$g(q, t) = 0$$
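
As a minimal sketch of how the constrained equations of motion can be evaluated, the snippet below solves for the acceleration and the Lagrange multiplier of a planar point-mass pendulum. The pendulum, the constraint g(q) = 0.5 (q·q - length²), the choice L(q) = I, and the function name are all assumptions made for illustration; they are not taken from the slides.

```python
import numpy as np

def constrained_accel(q, v, m=1.0, grav=9.81):
    """Solve M*vdot + g_q^T*lam = f together with g_q*vdot = -v.v
    for a planar point-mass pendulum (hypothetical example)."""
    M = m * np.eye(2)                      # generalized mass matrix
    f = np.array([0.0, -m * grav])         # applied force (gravity)
    g_q = q.reshape(1, 2)                  # Jacobian of g(q) = 0.5*(q.q - length^2)
    # Saddle-point system that couples the dynamics and the
    # acceleration-level constraint d^2 g / dt^2 = 0.
    K = np.block([[M, g_q.T], [g_q, np.zeros((1, 1))]])
    rhs = np.concatenate([f, [-(v @ v)]])
    sol = np.linalg.solve(K, rhs)
    return sol[:2], sol[2]                 # vdot, lam

vdot, lam = constrained_accel(np.array([0.0, -1.0]), np.zeros(2))
```

For a pendulum hanging at rest, the acceleration is zero and the reaction force -g_q^T λ balances gravity.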
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Multibody  Dynamics:  Is  anything 
left  to  do? 
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•  Purpose:  understand/optimize  performance  before  building  prototype 
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All  the  good  music  has  already  been  written  by  people  with  wigs  and  stuff. 

Frank  Zappa 
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Frictional  Contact  Simulation 

[Commercial  Solution] 


•  Model  Parameters: 

-  Spheres:  60  mm  diameter  and  mass  0.882  kg 

- Forces: smoothing with stiffness of 1E5, force exponent of 2.2, damping coefficient of 10.0, and a penetration depth of 0.1

-  Simulation  length:  3  seconds 
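
The force parameters above resemble the inputs of a smoothed penalty contact law of the kind found in commercial multibody codes. The sketch below shows one common form: an elastic term k·δ^e plus a damping term that is ramped in smoothly over the stated penetration depth. Whether the commercial tool uses exactly this law is an assumption, the function and argument names are hypothetical, and units are left as given on the slide.

```python
def smooth_step(x, x0, x1):
    """Cubic step that rises from 0 at x0 to 1 at x1."""
    if x <= x0:
        return 0.0
    if x >= x1:
        return 1.0
    s = (x - x0) / (x1 - x0)
    return s * s * (3.0 - 2.0 * s)

def penalty_contact_force(pen, pen_rate, k=1e5, expo=2.2, c_max=10.0, d_max=0.1):
    """Normal force for penetration pen and penetration rate pen_rate.

    pen_rate > 0 means the bodies are moving deeper into each other.
    The result is clamped at zero so the contact never pulls.
    """
    if pen <= 0.0:
        return 0.0
    elastic = k * pen ** expo
    damping = c_max * smooth_step(pen, 0.0, d_max) * pen_rate
    return max(0.0, elastic + damping)
```

Laws of this type need small time steps when the stiffness is large, which is one motivation for the complementarity-based approach discussed later in the deck.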
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•  How  is  the  Rover  moving  along  on  a  slope  with  granular  material? 

•  What  wheel  geometry  is  more  effective? 


Multibody  Dynamics:  Lots  to  be  done... 
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•  Applications  transitioning  from  multi-body  to  many-body  dynamics 

•  Bodies  interacting  through  friction/contact/impact 

•  Bodies  are  compliant,  sometimes  undergo  large  deformations 

•  Bodies  might  interact  with  fluid  (FSI) 

• Tomorrow’s problems are in the realm of multi-physics
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Simulating  large  engineering  problems 
remains  a  challenge... 

[Figure: architecture diagram of the lab's research heterogeneous computing cluster. Labels recoverable from the diagram:]

• Panels: CPU/GPU Node Architecture, AMD Node Architecture, File Server Architecture
• Users: internal users; remote collaborators via the Internet
• Head node
• CPU/GPU Node 1 through CPU/GPU Node 14
• AMD node: four AMD Opteron 6276 CPUs (CPU 0 to CPU 3)
• GPU card: 1.5 GB RAM, 448 cores, PCIe x16 2.0
• RAM sizes shown: 16 GB, 48 GB, and 128 GB DDR3
• File server: RAID 6, 24x 2 TB hard disks
• Connection types: Gigabit Ethernet (switch), 4x QDR Infiniband (switch), Infiniband HCAs
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Lab’s  Research  Heterogeneous 
Computing  Cluster 
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•  More  than  25,000  GPU  scalar  processors 

-  Can  manage  about  75,000  GPU  parallel  threads  at  full  capacity 

• More than 1000 CPU cores

•  Mellanox  Infiniband  Interconnect,  40Gb/sec 

•  About  0.7  TB  of  RAM 

•  More  than  20  Tflops  DP 


The issue is not hardware availability. Rather, it is producing modeling and solution techniques that can leverage this hardware.
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Heterogeneous  Computing  Template  (HCT): 

A  Research-Grade  Software  Infrastructure 

MODELING and SIMULATION, TESTING and VALIDATION

for  Large  Scale  Computational  Dynamics  Simulation 
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•  Goal,  lab’s  research  effort:  shape  up  the  future  of  physics-based  simulation 

-  Develop  a  Heterogeneous  Computing  Template  (HCT)  that  leverages  emerging 
hardware  architectures  and  suitable  algorithms  to  solve  open  engineering  problems 


• Targeted “emerging hardware architectures”:

- Clusters of CPUs and GPUs (accelerators)

• More than 100 CPU cores, tens of GPU cards, tens of thousands of GPU cores


•  Focus  on  “open  engineering  problems” 

-  Vehicle  mobility,  granular  dynamics,  soil  modeling,  tire/terrain  modeling,  FSI,  etc. 
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HCT:  Five  Major  Components 
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•  Computational  Dynamics  requires 

-  Advanced  modeling  techniques 

-  Strong  algorithmic  (applied  math)  support 

-  Proximity  computation 

-  Domain  decomposition  &  Inter-domain  data  exchange 

-  Post-processing  (visualization) 


• HCT represents the library support, the associated API, and the embedded tools that support this five-component abstraction
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•  Multi-Physics  targeted  Computational  Dynamics  requires 

-  Advanced  modeling  techniques 

-  Strong  algorithmic  (applied  math)  support 

-  Proximity  computation 

-  Domain  decomposition  &  Inter-domain  data  exchange 

-  Post-processing  (visualization) 

HCT: Support for Advanced Modeling Techniques
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•  Modeling:  what  does  it  mean? 

-  The  process  of  formulating  a  set  of  governing  differential  equations  that  captures  the 
multi-physics  associated  with  the  engineering  problem  of  interest 


•  Modeling  decisions  are  consequential 

-  Good  modeling  places  you  at  an  advantage  when  it  comes  to  simulating  hard  problems 
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Traditional  Discretization  Scheme 
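
Complementarity condition (the $\frac{1}{h}\Phi_i$ term is the stabilization term):

$$i \in \mathcal{A}(q^{(l)}, \delta): \quad 0 \le \frac{1}{h}\Phi_i(q^{(l)}) + D_{i,n}^T v^{(l+1)} \;\perp\; \gamma_{i,n} \ge 0$$

Coulomb 3D friction model:

$$(\gamma_{i,u}, \gamma_{i,w}) = \operatorname*{argmin}_{\sqrt{\gamma_{i,u}^2 + \gamma_{i,w}^2} \,\le\, \mu_i \gamma_{i,n}} \; v^T \left(\gamma_{i,u} D_{i,u} + \gamma_{i,w} D_{i,w}\right)$$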
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The  Cone  Complementarity 
Problem  (CCP) 
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•  First  order  optimality  conditions  lead  to  Cone  Complementarity  Problem 


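• Introduce the convex hypercone...

$$\Upsilon = \bigoplus_{i \in \mathcal{A}(q^{l}, \epsilon)} \mathcal{FC}^{i}$$

... and its polar hypercone:

$$\Upsilon^{\circ} = \bigoplus_{i \in \mathcal{A}(q^{l}, \epsilon)} \mathcal{FC}^{i\circ}$$

$\mathcal{FC}^{i} \subset \mathbb{R}^3$ represents the friction cone associated with the i-th contact


CCP assumes the following form: Find $\gamma$ such that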
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The  Quadratic  Programming  Angle... 
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•  The  relaxed  EOM  represent  a  cone-complementarity  problem  (CCP) 

•  The  CCP  captures  the  first-order  optimality  condition  for  a  quadratic 
optimization  problem  with  conic  constraints: 

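$$\min_{\gamma} \; q(\gamma) = \frac{1}{2}\gamma^T N \gamma + d^T \gamma \quad \text{subject to } \gamma_i \in \Upsilon_i \text{ for } i = 1, 2, \ldots, N_c$$


Notation used:

$$\gamma \equiv [\gamma_1^T, \gamma_2^T, \ldots, \gamma_{N_c}^T]^T \in \mathbb{R}^{3 N_c}$$

and

$$\Upsilon_i : \sqrt{\gamma_{u,i}^2 + \gamma_{w,i}^2} - \mu_i \gamma_{n,i} \le 0$$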
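
To make the link between the quadratic program and the CCP explicit, the conditions below spell out first-order optimality for minimizing q over a convex cone. This is a sketch under the assumptions that N is symmetric, that $\Upsilon$ is the direct sum of the per-contact cones $\Upsilon_i$, and that $\Upsilon^{\circ}$ denotes the polar cone; the symbols follow the quadratic program above, not necessarily the exact form shown on the original CCP slide.

```latex
\nabla q(\gamma) = N\gamma + d,
\qquad
\Upsilon^{\circ} = \{\, y : y^T x \le 0 \ \ \forall x \in \Upsilon \,\}

% gamma minimizes q over the cone Upsilon only if
\gamma \in \Upsilon,
\qquad
-(N\gamma + d) \in \Upsilon^{\circ},
\qquad
\gamma^T (N\gamma + d) = 0
```

If N is also positive semidefinite, the problem is convex and these conditions are sufficient as well as necessary.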
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CCP  Solution  Algorithm 

[mapped  on  the  GPU] 

1. For each contact i, evaluate $\eta_i = 3 / \mathrm{Trace}(D_i^T M^{-1} D_i)$.

2. If some initial guess $\gamma^*$ is available for multipliers, then set $\gamma^0 = \gamma^*$, otherwise $\gamma^0 = 0$.

3. Initialize velocities: $v^0 = \sum_i M^{-1} D_i \gamma_i^0 + M^{-1} k$.

4. For each contact i, compute changes in multipliers for contact constraints:

$$\gamma_i^{r+1} = \lambda\, \Pi_{\Upsilon_i}\!\left(\gamma_i^r - \omega \eta_i \left(D_i^T v^r + b_i\right)\right) + (1 - \lambda)\, \gamma_i^r$$
$$\Delta\gamma_i^{r+1} = \gamma_i^{r+1} - \gamma_i^r$$
$$\Delta v_i = M^{-1} D_i\, \Delta\gamma_i^{r+1}$$

5. Apply updates to the velocity vector:

$$v^{r+1} = v^r + \sum_i \Delta v_i$$

6. r := r + 1. Repeat from 4 until convergence or $r > r_{max}$.
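
The six steps above can be sketched serially in NumPy as follows. This is an illustrative CPU version, not the GPU implementation referenced on the slide. The data layout (each D_i stored as an n-by-3 block with columns ordered normal, u, w), the default values of omega, lam, and tol, and all function names are assumptions.

```python
import numpy as np

def project_cone(g, mu):
    """Euclidean projection of g = (gn, gu, gw) onto the friction cone
    sqrt(gu^2 + gw^2) <= mu * gn."""
    gn, gu, gw = g
    t = np.hypot(gu, gw)
    if t <= mu * gn:                 # already inside the cone
        return g.copy()
    if mu * t <= -gn:                # inside the polar cone: project to apex
        return np.zeros(3)
    gn_new = (gn + mu * t) / (1.0 + mu * mu)
    scale = mu * gn_new / t          # t > 0 is guaranteed in this branch
    return np.array([gn_new, gu * scale, gw * scale])

def ccp_jacobi(Minv, D, k, b, mu, omega=0.3, lam=1.0, tol=1e-10,
               r_max=1000, gamma0=None):
    """Projected Jacobi iteration for the cone complementarity problem."""
    nc = len(D)
    # Step 1: per-contact step sizes.
    eta = [3.0 / np.trace(D[i].T @ Minv @ D[i]) for i in range(nc)]
    # Step 2: initial multipliers.
    gamma = [np.zeros(3) if gamma0 is None else gamma0[i].copy()
             for i in range(nc)]
    # Step 3: initial velocities.
    v = Minv @ (sum(D[i] @ gamma[i] for i in range(nc)) + k)
    for r in range(r_max):
        dv = np.zeros_like(v)
        change = 0.0
        # Step 4: every contact uses the same v^r (Jacobi style).
        for i in range(nc):
            trial = project_cone(
                gamma[i] - omega * eta[i] * (D[i].T @ v + b[i]), mu[i])
            new = lam * trial + (1.0 - lam) * gamma[i]
            dgamma = new - gamma[i]
            dv += Minv @ (D[i] @ dgamma)
            change = max(change, np.abs(dgamma).max())
            gamma[i] = new
        v = v + dv                   # Step 5
        if change < tol:             # Step 6: convergence test
            break
    return gamma, v
```

Because step 4 reads only the velocity from the previous iteration, the per-contact updates are independent of each other, which is the property that makes the loop over contacts a natural candidate for one-thread-per-contact execution.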


14-16  AUG  2012 


UNCLASSIFIED 


GVSETS 


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1  Million  Rigid  Spheres 

[parallel  on  the  GPU] 
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Objective  Function  Value 

[1K bodies, 3525 contacts]

The green and blue lines have 100 dots on them, i.e., 100 changes of active set.

| Method | Iterations | Final Objective Function Value | γ_min | γ_max | Computation Time [sec] |
|---|---|---|---|---|---|
| GPMINRES-no p | 1000 MinRes its. [within 100 changes of active set] | -2.9035 | 0.0 | 7.7487 | 6.7002 |
| GPMINRES-no p (not plotted above) | 10000 MinRes its. [within 1000 changes of active set] | -2.9045 | 0.0 | 8.2002 | 61.0698 |
| GPMINRES-p | 100 MinRes its. [within 100 changes of active set] | -2.8854 | 0.0 | 6.8551 | 1675 |
| Jacobi | 1000 | -2.5077 | 0.0 | 4.4961 | 3.6643 |
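
$$d = P_1 - P_2 = \left(\frac{1}{2\lambda_1} M_1 + \frac{1}{2\lambda_2} M_2\right) c + (b_1 - b_2)$$

$$\frac{\partial d}{\partial \alpha_i} = \frac{\partial P_1}{\partial \alpha_i} - \frac{\partial P_2}{\partial \alpha_i},
\qquad
\frac{\partial^2 d}{\partial \alpha_i \partial \alpha_j} = \frac{\partial^2 P_1}{\partial \alpha_i \partial \alpha_j} - \frac{\partial^2 P_2}{\partial \alpha_i \partial \alpha_j}$$

Notation:

A: rotation matrix
$M = A R^2 A^T$
$R = \mathrm{diag}(r_1, r_2, r_3)$
b: translation of the ellipsoid's center
$\lambda^2 = \frac{1}{4} n^T M n$

$$\min_{\alpha_1, \alpha_2} \; \| d(\alpha_1, \alpha_2) \|^2$$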
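
As a sanity check on the ellipsoid notation, the sketch below evaluates a point of the form P = M c / (2λ) + b and confirms that it lies on the ellipsoid's surface. It assumes M = A R² Aᵀ, λ² = ¼ cᵀ M c, and that n and c denote the same direction vector; these readings, and the function name, are assumptions made for illustration rather than statements from the slides.

```python
import numpy as np

def ellipsoid_point(A, radii, b, c):
    """Surface point of an ellipsoid whose outward normal is parallel to c.

    A: 3x3 rotation matrix, radii: semi-axes (r1, r2, r3),
    b: center, c: direction vector (need not be unit length).
    """
    M = A @ np.diag(np.square(radii)) @ A.T
    lam = 0.5 * np.sqrt(c @ M @ c)
    P = M @ c / (2.0 * lam) + b
    return P, M

# Axis-aligned example: semi-axes 1, 2, 3, centered at (1, 1, 1).
P, M = ellipsoid_point(np.eye(3), np.array([1.0, 2.0, 3.0]),
                       np.ones(3), np.array([0.0, 0.0, 1.0]))
# Surface test: (P - b)^T M^{-1} (P - b) should equal 1.
residual = (P - np.ones(3)) @ np.linalg.solve(M, P - np.ones(3)) - 1.0
```

With this parameterization, the distance vector d between two ellipsoids depends only on the direction c, which can be described by two angles, so the minimization runs over two unknowns.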
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Collision  Detection 
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•  Broad  phase 

-  Draws  on  an  Axis  Aligned  Bounding  Box  (AABB)  approach 

•  Narrow  phase 

-  Draws  on  Minkowski  Portal  Refinement 
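
A minimal serial sketch of an AABB-based broad phase is given below. It sweeps the boxes along the x axis and reports index pairs whose boxes overlap; only those pairs would be passed on to a narrow phase. This is a generic sweep-style illustration with hypothetical function names, not the GPU algorithm used in the deck, and the narrow phase (Minkowski Portal Refinement) is not shown.

```python
def sphere_aabb(center, radius):
    """Axis-aligned bounding box of a sphere as (min_xyz, max_xyz)."""
    return (tuple(c - radius for c in center),
            tuple(c + radius for c in center))

def aabb_overlap(a, b):
    """True if boxes a and b overlap on all three axes."""
    return all(a[0][k] <= b[1][k] and b[0][k] <= a[1][k] for k in range(3))

def broad_phase(boxes):
    """Return the sorted list of index pairs (i, j), i < j, with overlapping boxes."""
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][0][0])
    pairs = []
    active = []                      # boxes whose x-extent may still overlap
    for i in order:
        xmin = boxes[i][0][0]
        # Drop boxes that end before the current box starts.
        active = [j for j in active if boxes[j][1][0] >= xmin]
        for j in active:
            if aabb_overlap(boxes[i], boxes[j]):
                pairs.append((min(i, j), max(i, j)))
        active.append(i)
    return sorted(pairs)

boxes = [sphere_aabb((0.0, 0.0, 0.0), 1.0),
         sphere_aabb((1.5, 0.0, 0.0), 1.0),
         sphere_aabb((5.0, 0.0, 0.0), 1.0)]
candidate_pairs = broad_phase(boxes)
```

The broad phase is deliberately conservative: overlapping boxes do not imply contact, so every candidate pair still needs an exact narrow-phase test.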
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Multiple-GPU  Collision  Detection 
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Assembled  Quad  GPU  Machine 
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Processor:  AMD  Phenom  II  X4  940  Black 
Memory:  16GB  DDR2 
Graphics: 4x NVIDIA Tesla C1060
Power supply 1: 1000W
Power  supply  2:  750W 
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Software/Hardware  Setup 


[Figure: software/hardware setup diagram. Recoverable labels:]

• OpenMP: Thread 0, Thread 1, Thread 2, Thread 3
• Host: quad-core AMD microprocessor, 16 GB RAM
• CUDA: Tesla C1060, 4x 4 GB memory, 4x 30720 threads

[Figure: Spheres, Contacts vs. Time, Quad Tesla C1060 Configuration; horizontal axis: Contacts (Billions), 0 to 6]

Speedup - GPU vs. CPU (Bullet)

[results  reported  are  for  spheres] 

h = 0.0001 [s]
g = -9.80665 [m/s^2]
20k spheres
r = 3.5 mm
p = .46
ω = π [rad/sec]
Anchor width = 5 [cm]
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200,000  Bodies  &  10  kg  Anchor 
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Anchor  Penetration  Depth, 
Function  of  Applied  Torque 


[Figure: Anchor Depth vs Time; horizontal axis: Time (sec), 0 to 10]
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Depth  as  a  Function  of  Pulling 
Force 

[Figure: Anchor Depth vs Time; horizontal axis: Time (sec), 0 to 10; vertical axis: Depth (m)]
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Track  Simulation 


Parameters: 


• Driving speed: 1.0 rad/sec

•  Length:  12  seconds 

•  Time  step:  0.005  sec 


•  Computation  time:  18.5  hours 

• Particle radius: 0.027273 m

•  Terrain:  284,715  particles 

• Inertia parameters of track are fake
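
To put the listed run time in perspective, the short calculation below derives the number of time steps and the average wall-clock cost per step from the parameters above. Only the three input values come from the slide; the variable names are illustrative.

```python
sim_length_s = 12.0      # simulated time, from the slide
time_step_s = 0.005      # integration step size, from the slide
wall_clock_h = 18.5      # reported computation time, from the slide

n_steps = round(sim_length_s / time_step_s)
wall_clock_s = wall_clock_h * 3600.0
sec_per_step = wall_clock_s / n_steps
slowdown = wall_clock_s / sim_length_s   # wall-clock seconds per simulated second

print(n_steps, sec_per_step, slowdown)
```

This works out to 2400 steps at roughly 27.75 seconds each, i.e., about 5550 seconds of computing per simulated second for the 284,715-particle terrain.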
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Dual  Track  ‘Footprint’  MSTV 
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In  theory,  there  is  no  difference  between  theory  and  practice.  In  practice,  there  is. 

Yogi Berra
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Real  Masses  for  Both  Obstacles 
and  Terrain... 
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Vehicle-Track-Terrain  Interaction 

[Figure: Force vs Time; horizontal axis: Time (sec), 4 to 5]
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Conclusions/Putting  Things  in 
Perspective 

• Goal: investigate how computing can catalyze advances in science and innovation in engineering over the next 10 years


• Reaching the goal...

-  Develop  an  experimentally  validated  Heterogeneous  Computing  Template  (HCT) 

-  Use  HCT  to  advance  state  of  the  art  in  physics-based  simulation 
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Thank  You. 


negrut@wisc.edu

http://sbel.wisc.edu 

University  of  Wisconsin-Madison 
Simulation-Based  Engineering  Lab 
Wisconsin  Applied  Computing  Center 

More  Animations  at: 
http://sbel.wisc.edu/Animations/ 
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